
94 research outputs found

An Empirical Study on Android-related Vulnerabilities

Authors: Bavota Gabriele; Escobar-Velasquez Camilo; Linares-Vasquez Mario
Publication date: 11/04/2017

Mobile devices are used more and more in everyday life. They are our cameras, wallets, and keys. In effect, they carry most of our private information in our pockets. For this and other reasons, mobile devices, and in particular the software that runs on them, are considered first-class citizens in the software-vulnerabilities landscape. Several studies have investigated software vulnerabilities in the context of mobile apps and, more generally, mobile devices. Most of these studies focused on vulnerabilities that could affect mobile apps, while only a few investigated vulnerabilities affecting the underlying platform on which mobile apps run: the Operating System (OS). Also, these studies were run on a very limited set of vulnerabilities. In this paper we present the largest study to date investigating Android-related vulnerabilities, with a specific focus on the ones affecting the Android OS. In particular, we (i) define a detailed taxonomy of the types of Android-related vulnerabilities; (ii) investigate the layers and subsystems of the Android OS affected by vulnerabilities; and (iii) study the survivability of vulnerabilities (i.e., the number of days between a vulnerability's introduction and its fix). Our findings could help OS and app developers focus their verification & validation activities, and help researchers build vulnerability detection tools tailored to the mobile world.
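The survivability measure defined in the abstract (days between a vulnerability's introduction and its fix) can be sketched as follows. This is only a minimal illustration: the function name and the example dates are hypothetical and are not taken from the paper or its dataset.

```python
from datetime import date


def survivability_days(introduced: date, fixed: date) -> int:
    """Return the number of days between introduction and fix."""
    if fixed < introduced:
        # A fix cannot precede the introduction of the vulnerability.
        raise ValueError("fix date precedes introduction date")
    return (fixed - introduced).days


# Hypothetical dates, for illustration only.
example = survivability_days(date(2015, 1, 10), date(2015, 3, 1))
print(example)
```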

arXiv.org e-Print Archive

Crossref

Using Structural and Semantic Information to Support Software Refactoring

Author: Bavota Gabriele
Publication venue: Università degli Studi di Salerno
Publication date: 20/07/2015

2011 - 2012. In the software life cycle, the internal structure of a system undergoes continuous modifications. These changes push the source code away from its original design, often reducing its quality. In such cases, refactoring techniques can be applied to improve the design quality of the system. Approaches in the literature mainly exploit structural relationships present in the source code, e.g., method calls, to support the software engineer in identifying refactoring solutions. However, developers also embed semantic information in the source code, e.g., the terms used in comments. This research investigates the usefulness of combining structural and semantic information to support software refactoring. In particular, a framework of approaches supporting different refactoring operations, i.e., Extract Class, Move Method, Extract Package, and Move Class, is presented. All the approaches have been empirically evaluated. Particular attention has been devoted to evaluations conducted with software developers, to understand whether the refactoring operations suggested by the proposed approaches are meaningful from their point of view. [edited by Author] XI n.s.

EleA@UniSA - Università degli Studi di Salerno

Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We?

Authors: Bavota Gabriele; Di Penta Massimiliano; Mastropaolo Antonio
Publication date: 17/08/2023

When evolving their software, organizations and individual developers have to spend substantial effort to pay back technical debt, i.e., the fact that software is released in a shape not as good as it should be, e.g., in terms of functionality, reliability, or maintainability. This paper empirically investigates the extent to which technical debt can be automatically paid back by neural-based generative models, and in particular models exploiting different strategies for pre-training and fine-tuning. We start by extracting a dataset of 5,039 Self-Admitted Technical Debt (SATD) removals from 595 open-source projects. SATD refers to technical debt instances documented (e.g., via code comments) by developers. We use this dataset to experiment with seven different generative deep learning (DL) model configurations. Specifically, we compare transformers pre-trained and fine-tuned with different combinations of training objectives, including the fixing of generic code changes, SATD removals, and SATD-comment prompt tuning. We also investigate the applicability in this context of a recently available Large Language Model (LLM)-based chatbot. The results of our study indicate that the automated repayment of SATD is a challenging task, with the best model we experimented with able to automatically fix ~2% to 8% of test instances, depending on the number of attempts it is allowed to make. Given the limited size of the fine-tuning dataset (~5k instances), the model's pre-training plays a fundamental role in boosting performance. Also, the ability to remove SATD steadily drops if the comment documenting the SATD is not provided as input to the model. Finally, we found that general-purpose LLMs are not a competitive approach for addressing SATD.

arXiv.org e-Print Archive

Toward Automatically Completing GitHub Workflows

Authors: Bavota Gabriele; Di Penta Massimiliano; Mastropaolo Antonio; Zampetti Fiorella
Publication date: 06/09/2023

Continuous integration and delivery (CI/CD) are nowadays at the core of software development. Their benefits come at the cost of setting up and maintaining the CI/CD pipeline, which requires knowledge and skills often orthogonal to those entailed in other software-related tasks. While several recommender systems have been proposed to support developers across a variety of tasks, little automated support is available when it comes to setting up and maintaining CI/CD pipelines. We present GH-WCOM (GitHub Workflow COMpletion), a Transformer-based approach supporting developers in writing a specific type of CI/CD pipeline, namely GitHub workflows. To deal with this task, we designed an abstraction process that helps the Transformer learn while still allowing GH-WCOM to recommend very peculiar workflow elements such as tool options and scripting elements. Our empirical study shows that GH-WCOM provides up to 34.23% correct predictions, and that the model's confidence is a reliable proxy for the likelihood that a recommendation is correct.

arXiv.org e-Print Archive

Mining Version Histories for Detecting Code Smells

Authors: Bavota Gabriele; De Lucia Andrea; Palomba Fabio; Poshyvanyk Denys
Publication venue: W&M ScholarWorks
Publication date: 01/01/2015

Code smells are symptoms of poor design and implementation choices that may hinder code comprehension, and possibly increase change- and fault-proneness. While most detection techniques rely only on structural information, many code smells are intrinsically characterized by how code elements change over time. In this paper, we propose Historical Information for Smell deTection (HIST), an approach exploiting change history information to detect instances of five different code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy. We evaluate HIST in two empirical studies. The first, conducted on 20 open source projects, aimed at assessing the accuracy of HIST in detecting instances of the code smells mentioned above. The results indicate that the precision of HIST ranges between 72 and 86 percent, and its recall ranges between 58 and 100 percent. The results of the first study also indicate that HIST is able to identify code smells that cannot be identified by competitive approaches based solely on code analysis of a single system's snapshot. Then, we conducted a second study aimed at investigating to what extent the code smells detected by HIST (and by competitive code analysis techniques) reflect developers' perception of poor design and implementation choices. We involved 12 developers of four open source projects, who recognized more than 75 percent of the code smell instances identified by HIST as actual design/implementation problems.
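For readers unfamiliar with the precision and recall figures quoted in the abstract, the sketch below shows how these two metrics are conventionally computed for a detector. It is a generic illustration, not HIST's evaluation code; the function name and the class names in the example sets are hypothetical.

```python
def precision_recall(detected: set, actual: set) -> tuple:
    """Compute (precision, recall) of a detector against a ground truth."""
    true_positives = len(detected & actual)
    # Precision: share of reported instances that are real smells.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of real smells that the detector reported.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical class names, for illustration only.
reported = {"OrderManager", "Utils", "Parser", "Config"}
ground_truth = {"OrderManager", "Utils", "Parser", "Session", "Cache"}
print(precision_recall(reported, ground_truth))
```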

College of William & Mary: W&M Publish